{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Income Prediction Explanations\n",
    "\n",
    "![Census](https://archive.ics.uci.edu/ml/assets/MLimages/Large2.jpg)\n",
    "\n",
    "We will use an SKLearn classifier built on the [1996 US Census DataSet](https://archive.ics.uci.edu/ml/datasets/adult) which predicts high (>50K$) or low (<=50K$) income based on the Census demographic data. \n",
    "\n",
    "The Kfserving resource provdes:\n",
    "   * A pretrained sklearn model stored on a Google bucket\n",
    "   * A pretrained Tabular [Seldon Alibi](https://github.com/SeldonIO/alibi) Explainer. The training has taken samples of the training data and stored the categorical mapping to allow for human readable results. See the [Alibi Docs](https://docs.seldon.io/projects/alibi/en/stable/) for further details of training and setting up a model explainer for your data.\n",
    "   \n",
    "** For users of KFServing v0.3.0 please follow the [notebook for that branch](https://github.com/kubeflow/kfserving/blob/v0.3.0/docs/samples/explanation/alibi/income/income_explanations.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -r requirements.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pygmentize income.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl apply -f income.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "CLUSTER_IPS=!(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')\n",
    "CLUSTER_IP=CLUSTER_IPS[0]\n",
    "print(CLUSTER_IP)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "SERVICE_HOSTNAMES=!(kubectl get inferenceservice income -o jsonpath='{.status.url}' | cut -d \"/\" -f 3)\n",
    "SERVICE_HOSTNAME=SERVICE_HOSTNAMES[0]\n",
    "print(SERVICE_HOSTNAME)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.append('../')\n",
    "from alibi_helper import *\n",
    "from alibi.datasets import fetch_adult\n",
    "adult = fetch_adult()\n",
    "cmap = dict.fromkeys(adult.category_map.keys())\n",
    "for key, val in adult.category_map.items():\n",
    "    cmap[key] = {i: v for i, v in enumerate(val)}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "idxLow = 0\n",
    "idxHigh = 32554\n",
    "for idx in [idxLow,idxHigh]:\n",
    "    show_row([getFeatures([adult.data[idx]], cmap)],adult)\n",
    "    show_prediction(predict(adult.data[idx:idx+1].tolist(),\"income\",adult,SERVICE_HOSTNAME,CLUSTER_IP))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get Explanation for Low Income Prediction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "exp = explain(adult.data[idxLow:idxLow+1].tolist(),\"income\",SERVICE_HOSTNAME,CLUSTER_IP)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_anchors(exp['data']['anchor'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Show precision. How likely predictions using the Anchor features would produce the same result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_bar([exp['data']['precision']],[''],\"Precision\")\n",
    "show_bar([exp['data']['coverage']],[''],\"Coverage\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_feature_coverage(exp['data'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_examples(exp['data'],0,adult)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "show_examples(exp['data'],0,adult,False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get Explanation for High Income Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "exp = explain(adult.data[idxHigh:idxHigh+1].tolist(),\"income\", SERVICE_HOSTNAME,CLUSTER_IP)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_anchors(exp['data']['anchor'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Show precision. How likely predictions using the Anchor features would produce the same result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_bar([exp['data']['precision']],[''],\"Precision\")\n",
    "show_bar([exp['data']['coverage']],[''],\"Coverage\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_feature_coverage(exp['data'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_examples(exp['data'],0,adult)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "show_examples(exp['data'],0,adult,False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Teardown"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl delete -f income.yaml"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
